14. Translate Performance into Clinical Utility Exercise

The last layer of your CNN outputs a probability that an image belongs to a given class. The last step in assessing the performance of your model is choosing an appropriate cut-off threshold for this probability so that the model behaves in a way that is clinically optimal. Changing this threshold changes the true positive, false positive, false negative, and true negative rates, which we learned about in Lesson 1. This, in turn, changes the precision and recall of the model. Precision and recall are important concepts in clinical testing, and usually one is optimized at the expense of the other.
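To make the effect of the threshold concrete, here is a minimal sketch in plain Python. The labels and probabilities are made-up toy values (not the exercise data), and the helper names `confusion_counts` and `precision_recall` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy data (not the exercise data): 1 = malignant, 0 = not malignant.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.10, 0.05]


def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, fn, tn), calling a case positive when prob >= threshold."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Raising the threshold trades recall for precision on this toy data.
for threshold in (0.25, 0.50, 0.75):
    tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
    precision, recall = precision_recall(tp, fp, fn)
    print(
        f"threshold={threshold:.2f} TP={tp} FP={fp} FN={fn} TN={tn} "
        f"precision={precision:.2f} recall={recall:.2f}"
    )
```

On this toy data, the lowest threshold catches every malignant case but raises more false alarms, while the highest threshold raises no false alarms but misses half of the malignant cases.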

In this exercise, you'll be given a dataframe containing ground truth labels and the probabilities output by an algorithm that you developed to classify images as having a malignant tumor or not. Your job is to generate a precision-recall curve and, optionally, an ROC curve with its AUC from the data in this dataframe.
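One possible way to compute the points of both curves is with scikit-learn, sketched below. This is an assumption about tooling, not necessarily the approach used in the course solution. The two lists stand in for the label and probability columns of the dataframe, whose actual column names are not given here.

```python
from sklearn.metrics import precision_recall_curve, roc_auc_score, roc_curve

# Hypothetical toy values standing in for the dataframe's ground truth
# and probability columns (1 = malignant, 0 = not malignant).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.10, 0.05]

# Precision and recall at each candidate threshold.
# The returned precision/recall arrays have one more entry than the
# thresholds array: a final point with precision 1 and recall 0.
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_prob)

# False positive rate and true positive rate at each threshold,
# plus the area under the ROC curve as a single summary number.
fpr, tpr, roc_thresholds = roc_curve(y_true, y_prob)
roc_auc = roc_auc_score(y_true, y_prob)

for p, r, t in zip(precision, recall, pr_thresholds):
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
print(f"ROC AUC = {roc_auc:.3f}")
```

To draw the curves, the arrays can be handed to a plotting library, for example `plt.plot(recall, precision)` and `plt.plot(fpr, tpr)` with matplotlib.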

Once you create these curves, choose two different thresholds: one that favors precision and one that favors recall. Use these thresholds to calculate two separate F1 scores. Then calculate the accuracy of your algorithm at each of these two thresholds, and think about why accuracy is or is not a good choice of performance statistic for your data.
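The comparison of F1 and accuracy at two thresholds can be sketched as follows. The data are hypothetical and deliberately imbalanced (2 malignant cases out of 20), which is an assumption about the kind of data where accuracy can mislead, not a description of the exercise dataframe. The thresholds 0.70 and 0.30 and the helper `evaluate` are likewise illustrative choices. F1 is computed as the harmonic mean of precision and recall.

```python
# Hypothetical, imbalanced toy data: 2 malignant cases among 20 images.
y_true = [1, 1] + [0] * 18
y_prob = [
    0.90, 0.35,                      # malignant cases
    0.60, 0.40, 0.30, 0.25, 0.20, 0.20, 0.15, 0.15, 0.10,
    0.10, 0.10, 0.05, 0.05, 0.05, 0.05, 0.02, 0.02, 0.01,
]


def evaluate(y_true, y_prob, threshold):
    """Compute precision, recall, F1, and accuracy at a given threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


favor_precision = evaluate(y_true, y_prob, 0.70)  # few, confident positive calls
favor_recall = evaluate(y_true, y_prob, 0.30)     # many positive calls
never_positive = evaluate(y_true, y_prob, 1.01)   # calls every image negative

for name, result in [
    ("favor precision", favor_precision),
    ("favor recall", favor_recall),
    ("never positive", never_positive),
]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

On this toy data, a model that never flags a tumor still reaches 90% accuracy while its recall and F1 are zero, which is one reason accuracy alone can be a poor statistic when positive cases are rare.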

Code

If you need the code, you can find it at https://github.com/udacity.